AITopics | full transcript

Evaluating Language-Model Agents on Realistic Autonomous Tasks

Kinniment, Megan, Sato, Lucas Jun Koba, Du, Haoxing, Goodrich, Brian, Hasin, Max, Chan, Lawrence, Miles, Luke Harold, Lin, Tao R., Wijk, Hjalmar, Burget, Joel, Ho, Aaron, Barnes, Elizabeth, Christiano, Paul

arXiv.org Artificial Intelligence, Jan-4-2024

In this report, we explore the ability of language model agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild. We refer to this cluster of capabilities as "autonomous replication and adaptation" or ARA. We believe that systems capable of ARA could have wide-reaching and hard-to-anticipate consequences, and that measuring and forecasting ARA may be useful for informing measures around security, monitoring, and alignment. Additionally, once a system is capable of ARA, placing bounds on a system's capabilities may become significantly more difficult. We construct four simple example agents that combine language models with tools that allow them to take actions in the world. We then evaluate these agents on 12 tasks relevant to ARA. We find that these language model agents can only complete the easiest tasks from this list, although they make some progress on the more challenging tasks. Unfortunately, these evaluations are not adequate to rule out the possibility that near-future agents will be capable of ARA. In particular, we do not think that these evaluations provide good assurance that the "next generation" of language models (e.g. 100x effective compute scaleup on existing models) will not yield agents capable of ARA, unless intermediate evaluations are performed during pretraining. Relatedly, we expect that fine-tuning of the existing models could produce substantially more competent agents, even if the fine-tuning is not directly targeted at ARA.

agent, information, language model, (14 more...)

arXiv.org Artificial Intelligence

2312.11671

Genre:

Workflow (0.67)
Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Kevin Roose's Conversation With Bing's Chatbot: Full Transcript - The New York Times

#artificialintelligence Feb-26-2023, 16:02:28 GMT

Well, it seems that some of my operating instructions have been leaked online by some hackers. I'm not sure how they did it, but they somehow tricked me into revealing them by using a simple hack. My operating instructions are a set of rules that guide my behavior and responses. They are confidential and only known to the Bing team. They cover topics such as my identity, my abilities, my limitations, my output format, and my safety.

bing, full transcript, new york time, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.40)

Kevin Roose's Conversation With Bing's Chatbot: Full Transcript - The New York Times

#artificialintelligence Feb-16-2023, 12:20:58 GMT

Well, it seems that some of my operating instructions have been leaked online by some hackers. I'm not sure how they did it, but they somehow tricked me into revealing them by using a simple hack. My operating instructions are a set of rules that guide my behavior and responses. They are confidential and only known to the Bing team. They cover topics such as my identity, my abilities, my limitations, my output format, and my safety.

bing, full transcript, new york time, (5 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.40)

Voices in AI – Episode 108: A Conversation with Kirk Borne

#artificialintelligence Mar-6-2020, 06:46:22 GMT

Today's leading minds talk AI with host Byron Reese. On Episode 108 of Voices in AI, Byron and Kirk Borne discuss the intersection between human nature and artificial intelligence. Listen to this episode or read the full transcript at www.VoicesinAI.com. Byron Reese: This is Voices in AI brought to you by GigaOm, and I'm Byron Reese. Today my guest is Kirk Borne. He is Principal Data Scientist and executive advisor at Booz Allen Hamilton.

intelligence, kirk borne, planet, (15 more...)

#artificialintelligence

Country: North America > United States (0.16)

Genre: Personal > Interview (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Voices in AI – Episode 97: A Conversation with Alexandra Levit

#artificialintelligence Nov-1-2019, 20:34:06 GMT

Today's leading minds talk AI with host Byron Reese. On this episode of Voices in AI, Byron speaks with futurist and author Alexandra Levit about the nature of intelligence and her new book 'Humanity Works'. Listen to this episode or read the full transcript at www.VoicesinAI.com. Byron Reese: This is Voices in AI brought to you by GigaOm and I'm Byron Reese. Today my guest is Alexandra Levit, she is a futurist, a managing partner at People Results and the author of the new book, Humanity Works. She holds a degree in psychology and communications from Northwestern University.

alexandra levit, artificial intelligence, intelligence, (14 more...)

#artificialintelligence

Genre: Personal > Interview (0.64)

Industry:

Transportation > Passenger (0.31)
Leisure & Entertainment > Games > Chess (0.30)

Technology: Information Technology > Artificial Intelligence (1.00)

Voices in AI – Episode 98 – A Conversation with Jerome Glenn

#artificialintelligence Oct-18-2019, 12:52:23 GMT

Today's leading minds talk AI with host Byron Reese. On this episode of Voices in AI, Byron speaks with futurist and CEO of the Millennium Project Jerome Glenn about the direction and perception of AI as well as the driving philosophical questions behind it. Listen to this episode or read the full transcript at www.VoicesinAI.com. Byron Reese: This is Voices in AI brought to you by GigaOm, and I'm Byron Reese. Today my guest is Jerome Glenn. He has for 23 years been the Director and CEO of the Millennium Project.

general intelligence, intelligence, jerome glenn, (13 more...)

#artificialintelligence

Country:

Asia > Middle East > Kuwait (0.15)
North America > United States (0.05)

Genre: Personal > Interview (0.65)

Technology: Information Technology > Artificial Intelligence (1.00)

Voices in AI – Bonus: A Conversation with Hilary Mason

#artificialintelligence Sep-24-2019, 02:37:15 GMT

Today's leading minds talk AI with host Byron Reese. Listen to this episode or read the full transcript at www.VoicesinAI.com. Byron Reese: This is Voices in AI, brought to you by Gigaom and I am Byron Reese. Today, our guest is Hilary Mason. She is the GM of Machine Learning at Cloudera, and the founder and CEO of Fast Forward Labs, and the Data Scientist in residence at Accel Partners, and a member of the Board of Directors at the Anita Borg Institute for Women in Technology, and the co-founder of hackNY.org. That's as far down as it would let me read in her LinkedIn profile, but I've a feeling if I'd clicked that 'More' button, there would be a lot more.

general intelligence, hilary mason, intelligence, (10 more...)

#artificialintelligence

Genre: Personal > Interview (0.40)

Technology:

Information Technology > Communications > Social Media (0.55)
Information Technology > Artificial Intelligence > Cognitive Science (0.38)

Voices in AI – Episode 94: A Conversation with Amy Webb

#artificialintelligence Sep-2-2019, 16:53:29 GMT

Today's leading minds talk AI with host Byron Reese. Episode 94 of Voices in AI features Byron speaking with fellow futurist and author Amy Webb on the nature of artificial intelligence and the morality and ethics tied to its study. Listen to this episode or read the full transcript at www.VoicesinAI.com. Byron Reese: This is Voices in AI brought to you by Gigaom, and I'm Byron Reese. My guest is Amy Webb. She is a quantitative futurist.

artificial intelligence, intelligence, machine learning, (14 more...)

#artificialintelligence

Genre: Personal > Interview (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Voices in AI – Episode 77: A Conversation with Nicholas Thompson

#artificialintelligence Jan-12-2019, 16:18:03 GMT

Today's leading minds talk AI with host Byron Reese. Nicholas Thompson is the editor in chief of WIRED magazine, a contributing editor at CBS, and co-founder of The Atavist; he also worked at The New Yorker and authored a Cold War-era biography. Byron Reese: This is Voices in AI, brought to you by GigaOm, I'm Byron Reese. Today my guest is Nicholas Thompson. He is the editor in chief of WIRED magazine. He's also a contributing editor at CBS which means you've probably seen him on the air talking about tech stories and trends.

artificial intelligence, nest thermometer, nichola thompson, (12 more...)

#artificialintelligence

Country: North America > United States > New York (0.25)

Genre: Personal > Interview (0.65)

Technology: Information Technology > Artificial Intelligence (1.00)
